Classifying and segmenting characters within an image

ABSTRACT

An image is acquired by a camera. The image has a first set of characters and a second set of characters. The first set of characters are classified as an identifier. The second set of characters are classified as data associated with the identifier. The image is divided to create an image segment. The image segment includes the first set of characters and not the second set of characters. The first set of characters are decoded in the image segment to generate a first character string. The second set of characters are decoded to generate a second character string. The first character string is linked to the second character string based on classifying the first set of characters as the identifier and the second set of characters as the data associated with the identifier.

CROSS-REFERENCES TO RELATED APPLICATIONS

This application is a continuation of U.S. Non-Provisional patent application Ser. No. 17/549,805, filed on Dec. 13, 2021, which claims priority to U.S. Provisional Application No. 63/143,269, filed on Jan. 29, 2021, the disclosures of which are incorporated by reference in their entirety for all purposes.

BACKGROUND

This disclosure relates in general to a camera in a mobile device. More specifically, and without limitation, this disclosure relates to decoding barcodes in a scene or image using the camera in the mobile device. Barcodes have traditionally been scanned using a specialized scanner. For example, a barcode scanner comprising a laser is used to shine light on a barcode, and reflected light from the barcode is detected and used to decode the barcode. As mobile devices (e.g., smartphones and tablets) with cameras have become more common, mobile devices are being used to decode codes by acquiring an image of a code and using image analysis to decode the code. An example of a method for using a smartphone to decode a barcode is provided in U.S. Pat. No. 8,596,540, granted on Dec. 3, 2013.

BRIEF SUMMARY

This disclosure relates in general to applications of optical character recognition used to assist with the decoding of optical patterns. Mobile devices having a camera, and being capable of hosting mobile applications, offer a flexible and scalable solution for optical pattern decoding. However, detecting and/or decoding characters in an image (e.g., optical character recognition, OCR) can be resource intensive. To speed OCR in an image, in some embodiments an image can be segmented and OCR run on only the segment of the image. In some embodiments, machine learning can be used to OCR specific classes of text (e.g., and not others), thereby reducing computation because text not in the class is not OCR'd. In some embodiments, a flexible pipeline can be used to provide tailored, application-specific class identification for text to OCR. In some configurations, techniques, methods, and/or systems can be applied to augment optical code scanning in different use cases, irrespective of the type of text to be scanned.

In certain embodiments, an apparatus for image analysis using visual geometry as an anchor for optical character recognition comprises a camera and one or more processors. The one or more processors are configured to: receive an image acquired by a camera; analyze the image to detect a location within the image having a specified geometry, wherein the specified geometry is a predefined, visual geometry; divide the image to create an image segment, where the image segment is based on the location of the specified geometry within the image; analyze the image segment to detect one or more characters within the image segment; decode the one or more characters in the image segment; and/or generate a character string based on decoding the one or more characters in the segment. In some embodiments, the specified geometry is an arrangement of rectangles.

In certain embodiments, a method for image analysis using visual geometry as an anchor for optical character recognition comprises: receiving an image acquired by a camera; analyzing the image to detect a location within the image having a specified geometry, wherein the specified geometry is a predefined, visual geometry; dividing the image to create an image segment, where the image segment is based on the location of the specified geometry within the image; analyzing the image segment to detect one or more characters within the image segment; decoding the one or more characters in the image segment; and/or generating a character string based on decoding the one or more characters in the image segment. In some embodiments, the specified geometry is a class of barcodes; the specified geometry is an arrangement of rectangles; the arrangement of rectangles is a one-dimensional code (e.g., a barcode); the arrangement of rectangles is a two-dimensional code (e.g., a QR code); the specified geometry is a label; the specified geometry is a type of label (e.g., price label, mailing label, etc.); the specified geometry is multiple lines of text extending a specified distance; the specified geometry is a symbol; and/or analyzing the image segment to detect the one or more characters comprises identifying characters of the largest font within the image segment.

In some configurations, a method for image analysis using feature detection for optical character recognition comprises receiving an image acquired by a camera; analyzing the image to detect a feature, wherein detecting the feature is based on machine learning; dividing the image to create an image segment, where the image segment is based on a location of the feature in the image; analyzing the image segment to detect one or more characters within the image segment; decoding the one or more characters in the image segment; and/or generating a character string based on decoding the one or more characters in the segment.

In some configurations, an apparatus for image analysis using a visual feature as an anchor for optical character recognition comprises a camera; one or more sensors, in addition to the camera; and/or one or more processors configured to: receive an image acquired by the camera; analyze the image to detect a location within the image having a specified feature, wherein the specified feature is a visual feature; divide the image to create an image segment, where the image segment is based on the location of the specified feature within the image; analyze the image segment to detect one or more characters within the image segment; decode the one or more characters in the image segment; and/or generate a character string based on decoding the one or more characters in the segment. In some embodiments, the feature is a predefined, visual geometry; and/or detecting the feature is based on machine learning.

In some configurations, a method for image analysis using text classification for optical character recognition comprises receiving an image acquired by a camera, wherein the image comprises a first set of characters and a second set of characters; analyzing at least a portion of the image to classify the first set of characters as belonging to a specified class of text, wherein the specified class is predefined; decoding the first set of characters and not decoding the second set of characters, based on the first set of characters classified as belonging to the specified class and the second set of characters not being classified as belonging to the specified class; and/or generating a character string based on decoding the first set of characters. In some embodiments, the method further comprises: analyzing the image to detect a feature, wherein detecting the feature is based on machine learning; dividing the image to create an image segment, where the image segment is based on a location of the feature in the image; the portion of the image is the image segment; saving the character string to memory; classifying text based on features of text; using machine learning for classification; and/or analyzing the second set of characters to classify the second set of characters as belonging to a second specified class of text, wherein the second specified class is predefined.

In some configurations, a method for image analysis using a flexible pipeline comprises training a first engine to identify a first class of text; training a second engine to identify a second class of text; providing a first user the first engine for detecting the first class of text; and/or providing a second user the second engine for detecting the second class of text.

In some configurations, a method for image analysis using optical character recognition with barcode detection comprises receiving an image acquired by a camera; analyzing the image to detect a barcode in the image; attempting to decode the barcode in the image; ascertaining that attempting to decode the barcode failed; dividing the image to create an image segment, where the image segment is based on a location of the barcode in the image; analyzing the image segment to detect one or more characters within the image segment; decoding the one or more characters in the image segment; and/or generating a character string based on decoding the one or more characters in the segment.

In some configurations, a method for decoding information on a label comprises acquiring an image of the label, wherein the label contains alphanumeric characters and one or more barcodes; decoding the one or more barcodes; performing optical character recognition on the alphanumeric characters of the label; analyzing positions of the alphanumeric characters relative to the one or more barcodes; correlating the alphanumeric characters with the one or more barcodes; and/or reporting the alphanumeric characters correlated with the one or more barcodes.

Further areas of applicability of the present disclosure will become apparent from the detailed description provided hereinafter. It should be understood that the detailed description and specific examples, while indicating various embodiments, are intended for purposes of illustration only and are not intended to necessarily limit the scope of the disclosure.

BRIEF DESCRIPTION OF THE DRAWINGS

The present disclosure is described in conjunction with the appended figures.

FIG. 1 depicts an example technique for automated recognition and decoding of a pattern in an image containing multiple patterns, in accordance with some embodiments.

FIG. 2 depicts an embodiment for using optical character recognition as a backup to barcode scanning.

FIG. 3 illustrates a flowchart of an embodiment of a process for image analysis using optical character recognition with barcode detection.

FIG. 4 illustrates examples of optical pattern labels encoding information in multiple formats.

FIG. 5 illustrates exemplary labels for scanning, in accordance with some embodiments.

FIG. 6 illustrates a flowchart of an embodiment of a process for image analysis using text classification for optical character recognition.

FIG. 7 illustrates exemplary shipping labels in multiple structures and information formats, according to some embodiments.

FIG. 8 illustrates example cases where a system may be configured for use-case specific scanning combining OCR and optical pattern scanning.

FIG. 9 illustrates various approaches to integration of OCR-optical pattern scanning, in accordance with some embodiments.

FIG. 10 illustrates a flowchart of an embodiment of a process for image analysis using a flexible pipeline.

FIG. 11 illustrates example reporting marks on rail stock.

FIG. 12 illustrates examples of numbering systems on rolling stock and shipping containers.

FIG. 13 illustrates a flowchart of an embodiment of a process for image analysis using visual geometry as an anchor for optical character recognition.

FIG. 14 illustrates a flowchart of an embodiment of a process for image analysis using machine-learning feature detection for optical character recognition.

FIG. 15 illustrates a flowchart of an embodiment of a process for image analysis using feature detection for optical character recognition.

FIG. 16 depicts a block diagram of an embodiment of a computer system.

In the appended figures, similar components and/or features may have the same reference label. Further, various components of the same type may be distinguished by following the reference label by a dash and a second label that distinguishes among the similar components. If only the first reference label is used in the specification, the description is applicable to any one of the similar components having the same first reference label irrespective of the second reference label.

DETAILED DESCRIPTION

The ensuing description provides preferred exemplary embodiment(s) only, and is not intended to limit the scope, applicability, or configuration of the disclosure. Rather, the ensuing description of the preferred exemplary embodiment(s) will provide those skilled in the art with an enabling description for implementing a preferred exemplary embodiment. It is understood that various changes may be made in the function and arrangement of elements without departing from the spirit and scope as set forth in the appended claims.

Examples of optical patterns include 1D barcodes, 2D barcodes, numbers, letters, and symbols. As scanning optical patterns is moved to mobile devices, there exists a need to increase scanning speed, increase accuracy, and/or manage processing power. Interpreting an optical pattern (e.g., scanning for an optical pattern) can be divided into two steps: detecting and decoding. In the detecting step, a position of an optical pattern within an image is identified and/or a boundary of the optical pattern is ascertained. In the decoding step, the optical pattern is decoded (e.g., to provide a character string, such as a numerical string, a letter string, or an alphanumerical string). As optical patterns, such as barcodes and QR codes, are used in many areas (e.g., shipping, retail, warehousing, travel), there exists a need for quicker scanning of optical patterns. In some embodiments, optical patterns can include alpha and/or numerical characters. The following are techniques that can increase the speed, accuracy, and/or efficiency of scanning for optical patterns. The following techniques can be used individually, in combination with each other, and/or in combination with other techniques.

FIG. 1 depicts an example technique for automated detection and decoding of one or more optical patterns in an image, in accordance with some embodiments. In FIG. 1, a system 100 (e.g., a mobile device) comprises a display 110 and a camera. The camera has a field of view (FOV) of a real scene. The camera is configured to capture an image 112 of the real scene. The real scene contains one or more optical patterns 114.

The camera can capture a plurality of images. The plurality of images can be presented in “real time” on the display 110 (e.g., presented on the display 110 in a sequential manner following capture, albeit potentially with some latency introduced by system processes). The image 112 is one of the plurality of images. The plurality of images depict the real world scene as viewed through the field of view of the camera. The real world scene may include multiple objects 150, patterns, or other elements (e.g., faces, images, colors, etc.) of which the optical patterns 114 are only a part. FIG. 1 depicts a first optical pattern 114-1 and a second optical pattern 114-2, among other optical patterns 114.

The image 112 may be captured by the camera and/or provided via additional or alternative system processes (e.g., from a memory device, a communications connection to an online content network, etc.). The optical patterns 114 are detected and/or recognized in the image 112. Detection and recognition of optical patterns may describe different approaches for image analysis of optical patterns. Detection may describe detecting an optical pattern in an image by characteristic discrete patterns (e.g., parallel bars or symbols). Recognition may include additional analysis of the pattern that provides descriptive and/or characteristic information (e.g., an optical pattern type), specific to the optical pattern, but does not necessarily include decoding the optical pattern. For example, a barcode may be detected in an image based on image analysis revealing a region of the image containing multiple parallel bars. After additional analysis, the barcode may be recognized as a UPC code. In some embodiments, detection and recognition are concurrent steps implemented by the same image analysis process, and as such are not distinguishable. In some embodiments, image analysis of optical patterns proceeds from detection to decoding, without recognition of the optical pattern. For example, in some embodiments, an approach can be used to detect a pattern of characters, and in a second step decode the characters with optical character recognition (OCR).

Detecting optical patterns 114 permits automatic (e.g., without user interaction) generation and/or presentation on the display 110 of one or more graphical elements 122. In some embodiments, the graphical elements 122 may include, but are not limited to, highlighted regions, boundary lines, bounding boxes, dynamic elements, or other graphical elements, overlaid on the image 112 to emphasize or otherwise indicate the positions of the optical patterns 114 in the plurality of images. Each optical pattern 114 may be presented with one or more graphical elements, such that a user is presented the positions of the optical patterns 114 as well as other metadata, including but not limited to pattern category, decoding status, or information encoded by the optical patterns 114.

The system 100 may identify one or more of the optical patterns 114 for decoding. As mentioned above, the decoding may be automated, initializing upon detection of an optical pattern 114 and successful implementation of a decoding routine. Subsequent to detection and/or decoding, object identifier information, optical pattern status, or other information to facilitate the processing of the optical patterns 114 may be included by a graphical element 122 associated with an optical pattern 114 that is decoded. For example, a first graphical element 122-1, associated with the first optical pattern 114-1, may be generated and/or presented via the display 110 at various stages of optical pattern detection and/or decoding. For example, after recognition, the first graphical element 122-1 may include information about an optical pattern template category or the number of patterns detected. Following decoding, the first graphical element 122-1 may present information specific to the first optical pattern 114-1. For an optical pattern 114 that is detected, but decoding is unsuccessful, the system 100 may alter a graphical element 122 to indicate decoding failure, as well as other information indicative of a source of the error. As an illustrative example, a second graphical element 122-2 may indicate that the second optical pattern 114-2 cannot be decoded by the system 100, for example, through dynamic graphical elements or textual information. For example, the second graphical element 122-2 is a yellow box surrounding the second optical pattern 114-2 after the second optical pattern 114-2 is detected; the second graphical element 122-2 is changed to a red box if the second optical pattern 114-2 is not decoded, or is changed to a green box if the second optical pattern 114-2 is decoded. Examples of graphical elements used during detecting and decoding optical patterns can be found in U.S. application Ser. No. 16/905,722, filed on Jun. 18, 2020, which is incorporated by reference for all purposes. Optical patterns can also be tracked, as described in U.S. patent application Ser. No. 16/920,061, filed on Jul. 2, 2020, which is incorporated by reference for all purposes.
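As a non-limiting illustration, the yellow/red/green box example above can be sketched as a mapping from pattern status to overlay color. The names PatternStatus, STATUS_COLORS, and overlay_color are hypothetical and are not part of the disclosure.

```python
from enum import Enum


class PatternStatus(Enum):
    """Hypothetical states of an optical pattern during scanning."""
    DETECTED = "detected"
    DECODED = "decoded"
    DECODE_FAILED = "decode_failed"


# Colors follow the yellow/red/green example in the text.
STATUS_COLORS = {
    PatternStatus.DETECTED: "yellow",
    PatternStatus.DECODED: "green",
    PatternStatus.DECODE_FAILED: "red",
}


def overlay_color(status: PatternStatus) -> str:
    """Return the box color for a graphical element given the pattern status."""
    return STATUS_COLORS[status]
```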

A. OCR Support Scanning

FIG. 2 depicts an embodiment for using optical character recognition as a backup to barcode scanning. In FIG. 2, an image 202 of a label 204 containing a vehicle identification number (VIN) is shown. The VIN includes both a barcode 208 and an alphanumeric code 212. The alphanumeric code comprises a set of characters. A character can be a letter, a number, or a character symbol (e.g., “/”, “,”, “.”, “;”, “!”, “@”, “#”, “$”, “%”, “^”, “&”, “*”, “<”, “>”, “+”, “−”, “=”, etc.). In some embodiments, a character symbol is limited to what can commonly be typed on a personal computer.

A system (e.g., system 100 in FIG. 1) scans the alphanumeric code 212 if the barcode 208 isn't decodable (e.g., the barcode is damaged, torn, blocked, washed out by glare, or missing). Scanning the alphanumeric code 212 can include one or both of the following to facilitate a text scan: (i) a user interaction; (ii) automatically detecting that the barcode 208 can be localized but not scanned, or automatically detecting that the barcode 208 is missing.

In some embodiments, text recognition can be used as an alternative data capture capability where optical code scanning fails. Scenarios for this include, but are not limited to, damaged barcodes, partially visible barcodes, removed barcodes, products and objects that aren't fully labeled with barcodes, barcodes that are too blurry due to distance (where text may require less resolution in certain situations), and/or barcodes with printing issues.

In some embodiments, OCR support scanning is an add-on to a barcode scanner, such that barcode scanning is the first approach applied, or both OCR and barcode scanning are implemented concurrently.

In some embodiments, a system implementing OCR support scanning may learn optical pattern structures from past scans to automatically define what strings to scan/select. In this way, the system may detect “broken” barcodes with barcode localization and trigger OCR scans close to the localized barcode. With OCR support scanning, exception handling may be reduced for processes where barcodes can be damaged during regular handling of a product or other goods.

Alphanumeric, human readable representations of an optical pattern may be used to assist with the decoding of the barcode. In some embodiments, an optical pattern is only partially decoded, for example due to physical damage in some area of the barcode. Subsequently, the alphanumeric number can be recognized via OCR and the missing information from the barcode can be added using the decoded alphanumeric characters.
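As a non-limiting illustration, filling in a partially decoded barcode from OCR output can be sketched as follows. The sketch assumes, hypothetically, that the barcode decoder reports undecodable positions as None and that the OCR string aligns position-by-position with the barcode payload; the disclosure does not prescribe either representation, and fill_from_ocr is an illustrative name.

```python
from typing import Optional, Sequence


def fill_from_ocr(partial: Sequence[Optional[str]], ocr_text: str) -> Optional[str]:
    """Fill unknown barcode positions (None) with OCR characters.

    Returns None when the lengths differ or when a known barcode character
    disagrees with the OCR result, since the two readings then conflict.
    """
    if len(partial) != len(ocr_text):
        return None
    merged = []
    for known, seen in zip(partial, ocr_text):
        if known is None:
            merged.append(seen)      # gap in the barcode: take the OCR character
        elif known == seen:
            merged.append(known)     # both readings agree
        else:
            return None              # conflict: do not guess
    return "".join(merged)
```

For example, a barcode read as "1HG", two damaged positions, and "82" can be completed from an OCR result of "1HGCM82".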

In some embodiments, the decoding of the alphanumeric characters is used to detect potential errors with the decoding of a barcode. For example, if the two decoded strings are not identical, there is a possibility that the barcode was not correctly decoded. This can be used to reduce or eliminate errors in barcode decoding (e.g., false positives). For example, it is possible to repeat the barcode scanning attempt and/or the OCR attempt until the two results match.
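As a non-limiting illustration, repeating the barcode and OCR attempts until the two results match can be sketched as a bounded retry loop. The callables read_barcode and read_text stand in for whatever decoders a system uses, and the max_attempts bound is an assumption added so that the loop terminates; neither is specified by the disclosure.

```python
from typing import Callable, Optional


def scan_until_match(
    read_barcode: Callable[[], Optional[str]],
    read_text: Callable[[], Optional[str]],
    max_attempts: int = 3,
) -> Optional[str]:
    """Retry both readers until they return the same string.

    Returns the agreed string, or None if no attempt produced a match.
    """
    for _ in range(max_attempts):
        code = read_barcode()
        text = read_text()
        if code is not None and code == text:
            return code
    return None
```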

In some embodiments, OCR support scanning may include, but is not limited to: (i) receiving a manual selection of text; (ii) enabling improved OCR integration with the optical code scanner; (iii) automatically selecting one of many fields that match a registration or other identifier; (iv) automatically locating text that is near an optical pattern, and scanning the text if the localized barcode is too damaged to be decoded; (v) receiving a manual selection of a barcode that cannot be scanned for text scanning with OCR; (vi) receiving a manual selection of a single or of multiple text objects by type of text visible in a region of an image; (vii) automatically scanning text according to patterns defined in the application or automatically recognized by the scanner based on previously scanned barcodes; (viii) fixing an area of an image for OCR; and/or (ix) receiving a manual selection of text in an image (e.g., a portion of an image or anywhere in an image).

In some embodiments, OCR may be automatically triggered. For example, a system may automatically detect that an optical code has been found but not scanned (with localization). Additionally and/or alternatively, if an optical code is found but not decoded, a system may detect and recognize text next to optical patterns based on proximity in an image. Configuration of OCR may also be automatic. For example, a system may automatically learn an optical pattern format from previous scans to configure OCR text pattern and length.
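As a non-limiting illustration, learning a text pattern and length from previous scans can be sketched by deriving a regular expression from the lengths and character classes already observed. The sketch assumes payloads are either all digits or uppercase alphanumeric; learn_pattern and filter_candidates are hypothetical names, and a deployed system could learn a richer format.

```python
import re
from typing import Iterable, List


def learn_pattern(previous_scans: Iterable[str]) -> re.Pattern:
    """Build a regex from the lengths and character classes seen so far."""
    scans = list(previous_scans)
    if not scans:
        raise ValueError("at least one previous scan is required")
    lengths = sorted({len(s) for s in scans})
    # Assumption: payloads are digits only, or uppercase letters and digits.
    char_class = "[0-9]" if all(s.isdigit() for s in scans) else "[0-9A-Z]"
    alternatives = "|".join(f"{char_class}{{{n}}}" for n in lengths)
    return re.compile(f"^(?:{alternatives})$")


def filter_candidates(pattern: re.Pattern, candidates: Iterable[str]) -> List[str]:
    """Keep only OCR strings that fit the learned pattern and length."""
    return [c for c in candidates if pattern.match(c)]
```

With such a pattern, OCR results such as a price or a promotional word found near a barcode can be discarded before a string is reported.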

In the embodiment shown in FIG. 2, an image of the label 204 is acquired by a camera of a mobile device. The image is analyzed to detect the barcode 208 in the image. An attempt is made to decode the barcode 208 (e.g., either locally on the mobile device or at a remote server based on an uploaded image from the mobile device). However, the system (e.g., the mobile device and/or a remote server) ascertains that the barcode 208 cannot be decoded (e.g., receives an error message). For example, there are lines through the barcode 208, rendering the barcode 208 unreadable by some systems and/or algorithms. After ascertaining the barcode 208 cannot be decoded, the system scans the image 202, or scans a portion of the image (e.g., an image segment), to decode the alphanumeric code 212. For example, the image 202 could be divided to create the image segment. An image segment is a portion of the image. The image 202 is divided based on the location of the barcode 208 in the image 202. The image 202 can be divided based on a dimension of the barcode 208. For example, the image segment could be defined by an area that is as wide as the barcode 208 and n times the height of the barcode, wherein n is equal to or greater than 2 or 3 and equal to or less than 2, 3, 4, or 5. In FIG. 2, the image segment, segment 216, is twice as tall as the barcode 208 and bottom justified on the barcode 208 because the alphanumeric code 212 is expected to be above the barcode 208. Length is not changed because the alphanumeric code 212 is expected to be above or below the barcode 208, and not to the side of the barcode 208. In another example, the image segment, segment 220, is defined as an area in relation to (e.g., above) the barcode 208 (e.g., a height of the image segment is k times the height of the barcode, wherein k is equal to or greater than 0.25, 0.5, 0.75, 1, 1.5, or 2 and/or equal to or less than 0.5, 1, 1.5, 2, 5, or 10).
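As a non-limiting illustration, the two segment definitions described for FIG. 2 can be sketched as arithmetic on the bounding box of the barcode. The sketch assumes pixel coordinates with the origin at the top-left of the image and y increasing downward, and clamps the segment at the top edge of the image; Box and the two function names are hypothetical.

```python
from typing import NamedTuple


class Box(NamedTuple):
    x: int        # left edge, in pixels
    y: int        # top edge, in pixels (y grows downward)
    width: int
    height: int


def segment_including_barcode(barcode: Box, n: float = 2.0) -> Box:
    """Like segment 216: barcode width, n times its height, bottom-justified."""
    height = round(n * barcode.height)
    bottom = barcode.y + barcode.height
    top = max(0, bottom - height)   # clamp at the top edge of the image
    return Box(barcode.x, top, barcode.width, bottom - top)


def segment_above_barcode(barcode: Box, k: float = 1.0) -> Box:
    """Like segment 220: barcode width, k times its height, directly above."""
    height = round(k * barcode.height)
    top = max(0, barcode.y - height)
    return Box(barcode.x, top, barcode.width, barcode.y - top)
```

For a barcode at Box(100, 300, 400, 80), n = 2 yields Box(100, 220, 400, 160), and k = 1 yields Box(100, 220, 400, 80).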

In some embodiments, the image segment includes the barcode 208 and an area adjacent to (e.g., contiguous with) an area of the barcode 208 (e.g., the image segment could be the area of the barcode 208 plus the area of segment 220) as shown by segment 216. A person skilled in the art will recognize various configurations based on characters positioned in relation to an optical code (e.g., characters could be above, below, left, right, embedded, or partially embedded in a barcode).

On a price label, a description of a product could be separated by a known distance from a location of the barcode. Accordingly, in some embodiments, an area of the image segment is not contiguous with an area of the barcode. Since a location of the barcode is used to segment an image for optical character recognition, it can be said that the barcode is an anchor for OCR, because the barcode can provide position information to define an image segment location and/or size. Though a barcode can be used as an anchor for OCR, there can be other anchors for OCR.

By running optical character recognition on the image segment, and not on the entire image, OCR can be run more quickly and/or use fewer computational resources. For example, an optical-code algorithm could be run to try to decode the barcode 208 in FIG. 2. The optical-code algorithm could be run on the mobile device. After the mobile device ascertains that the barcode 208 cannot be decoded, the mobile device divides the image to create an image segment (e.g., segment 216 or segment 220). An OCR algorithm is then run on the image segment. The OCR algorithm can be run on the mobile device or using a remote device, such as a remote server or a personal computer. Not only does running the OCR algorithm on the image segment take fewer computational resources than running the OCR algorithm on the entire image, but transmitting the image segment to a remote device also takes fewer resources (e.g., bandwidth) than transmitting the entire image.

FIG. 3 illustrates a flowchart of an embodiment of a process 300 for image analysis using optical character recognition with barcode detection. In this embodiment, OCR is used as a backup to barcode reading. For example, if a barcode is damaged, then OCR is used.

Process 300 begins in step 304 with detecting and attempting to decode a barcode in an image. In some embodiments, step 304 comprises receiving an image acquired by a camera. For example, the image of FIG. 2 is acquired by the system 100 in FIG. 1. The image is analyzed to detect a barcode in the image. For example, the image 202 in FIG. 2 is analyzed to detect the barcode 208. The system attempts to decode the barcode in the image. For example, the system 100 in FIG. 1 attempts to decode the barcode 208 in FIG. 2.

In step 308, the system fails at decoding the barcode in the image, and the system ascertains that the attempt to decode the barcode failed. For example, a decoding algorithm is run on a portion of the image containing the barcode 208 in FIG. 2, but the system is unable to interpret the barcode 208, or able to decode only a portion of the barcode 208. Thus an area is identified as containing a barcode, but the decoding algorithm cannot decode the barcode 208.

In step 312, the image is divided to create an image segment. For example, the image in FIG. 2 is divided to obtain segment 216 or segment 220. The location and/or size of the image segment is based on a location and/or a size of the barcode (e.g., as discussed in conjunction with FIG. 2).

One or more characters are decoded within the image segment, step 316. The image segment is analyzed to detect the one or more characters. For example, an algorithm is run on segment 220 in FIG. 2 to detect the alphanumeric code 212. If the alphanumeric code 212 is detected, then an OCR algorithm is run on the image segment to decode the alphanumeric code 212. In some embodiments, detecting the one or more characters is omitted and the OCR algorithm is run on the image segment.

A character string is generated based on decoding the one or more characters in the image segment. For example, a character string of the VIN is generated after decoding the alphanumeric code 212.
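As a non-limiting illustration, the flow of process 300 can be sketched as a function that tries barcode decoding first and falls back to OCR on an image segment. The four callables are placeholders for detection, decoding, segmentation, and OCR routines that the disclosure leaves open; their names and signatures, and the Region tuple, are assumptions.

```python
from typing import Callable, Optional, Tuple

Region = Tuple[int, int, int, int]  # hypothetical (x, y, width, height)


def read_label(
    image: object,
    detect_barcode: Callable[[object], Optional[Region]],
    decode_barcode: Callable[[object, Region], Optional[str]],
    make_segment: Callable[[Region], Region],
    run_ocr: Callable[[object, Region], Optional[str]],
) -> Optional[str]:
    """Return a character string from the barcode, or from OCR as a backup."""
    region = detect_barcode(image)            # step 304: detect the barcode
    if region is None:
        return None                           # no anchor for segmentation
    decoded = decode_barcode(image, region)   # step 304: attempt to decode
    if decoded is not None:
        return decoded
    # Step 308: decoding failed. Step 312: divide the image using the barcode.
    segment = make_segment(region)
    # Step 316: decode characters in the segment only, not the whole image.
    return run_ocr(image, segment)
```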

B. Smarter Label Scanning and Barcode Semantics

FIG. 4 illustrates examples of optical patterns on labels encoding information in multiple formats, including alphanumeric characters and barcodes. In some embodiments, a system may scan labels that include text fields and barcodes. Scanning may include, but is not limited to, price label scanning. Price labels often contain a barcode, a price, and a text-based description of the product. Other examples include shipping labels and product labels, especially labels on packaged consumer goods.

Text and barcode localization and recognition may permit a system to detect and determine meta-data describing a barcode. For example, the system may determine whether a barcode describes a model number or a serial number. As an illustrative example, product and shipment labels often use multiple 1D or 2D barcodes to encode relevant information, such as product number, serial number, batch and lot number, and/or expiry date. While there exist standards to encode not only the data but also the semantics of the data (e.g., GS1 AIs, Automotive standards such as VDA, AIAG Odette), labels may omit key-value encoding, relying on human readable strings in the proximity of the barcode to explain the semantics of each individual barcode.

Where a label includes multiple barcodes, manual scanning may include aiming a scanner at a particular barcode (or selecting it via tapping for camera-based barcode scanning) based on the human readable information on the label. This process may include a learning curve and may be error-prone and time-intensive (e.g., due at least in part to the precision required).

In some embodiments, optical patterns can be assigned a key/label without reading a human readable key because they follow a certain standard format or even particular encoding. Examples include IMEI, UPC, EAN, GS1 AI, and HIBCC patterns. Similarly, optical patterns can be assigned a key/label without reading a human readable key because they include a certain prefix (e.g., VDA labels). Where a human readable key is in the vicinity of a barcode, the human readable key may be the closest text string, but the closest text string to the barcode is not always the correct key. Therefore, a system may include some or all barcodes on the label. A system (e.g., an untrained system) may assign to each optical pattern the corresponding key/semantics so that a system may scan the entire label, including optical patterns, and the software application can automatically (e.g., without manual intervention) retrieve the barcode number(s) that are desired for a particular application context, e.g., serial number. In some embodiments, an untrained system is a system that has not seen the particular label layout before (e.g., and infers semantics of the label from barcodes and/or text strings present on the label; different from an approach where a system is explicitly told what each barcode on a label means).

In some embodiments, a system may identify and support a predefined set of key strings, which may get updated over time. Examples of key strings include, but are not limited to, serial numbers (e.g., SN, S/N, Serial) and IMEI. In some embodiments, the system may also “label” barcodes for which there is no predefined set of key strings.

FIG. 4 depicts an image 402. The image 402 includes a label 404 comprising a barcode 406, a description 408, and a price 410. The description 408 can be classified as a description because of its position relative to the barcode 406 and/or because it contains multiple lines of text. The price 410 can be classified as the price because of its position relative to the barcode 406 and/or because it is in the largest font. The image 402 is divided to form a first segment 412-1 and a second segment 412-2. The first segment 412-1 is decoded to generate a first character string. The second segment 412-2 is decoded to generate a second character string. The first character string is classified as a description, and the second character string is classified as a price. If only the price 410 is desired, then an OCR algorithm is run on only the second segment 412-2 and not the first segment 412-1, because the first segment 412-1 does not contain characters classified as price.

The barcode 406 is decoded and information about the EAN number, the product, and a catalog price are obtained (e.g., either from the barcode itself or from a database/catalog by matching the EAN number to an entry in the database). The price 410 is decoded using OCR and compared to the catalog price. If the catalog price is different from the price 410 on the label 404, then an error message can be generated to update label 404. Similarly, the description 408 can be compared to a catalog description. Accordingly, barcode information (e.g., catalog price) is combined with and/or compared to OCR information (e.g., price 410). This can be useful for label verification.
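As an illustrative, non-limiting sketch, the label verification described above can be expressed as a comparison between a price parsed from OCR text and a catalog price looked up by EAN. The function names, the in-memory catalog, the status strings, and the sample EAN are assumptions made for illustration and are not part of the disclosure.

```python
import re
from decimal import Decimal
from typing import Dict, Optional


def parse_price(ocr_text: str) -> Optional[Decimal]:
    """Extract a price such as '12.56' or '12,56' from OCR text (hypothetical helper)."""
    match = re.search(r"(\d+)[.,](\d{2})", ocr_text)
    if match is None:
        return None
    return Decimal(f"{match.group(1)}.{match.group(2)}")


def verify_label(ean: str, ocr_price_text: str, catalog: Dict[str, Decimal]) -> str:
    """Compare the OCR price on a label with the catalog price for the decoded EAN."""
    catalog_price = catalog.get(ean)
    if catalog_price is None:
        return "unknown product"
    label_price = parse_price(ocr_price_text)
    if label_price is None:
        return "price unreadable"
    if label_price != catalog_price:
        return "label needs update"
    return "ok"


# Assumed catalog: EAN decoded from the barcode -> catalog price.
catalog = {"4006381333931": Decimal("12.56")}
status = verify_label("4006381333931", "EUR 12,56", catalog)
```

Using a decimal type rather than floating point avoids spurious mismatches when comparing currency values.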

In some embodiments, barcode semantics are used to classify a barcode. For example, label 424 comprises five barcodes. If the IMEI is to be scanned, the IMEI could be one of five barcodes on label 424. An image of label 424 is segmented into five segments 428, including a first segment 428-1, a second segment 428-2, a third segment 428-3, a fourth segment 428-4, and a fifth segment 428-5, based on locations of barcodes and/or based on information that “IMEI” is above the barcode to be decoded. The system runs an optical character recognition algorithm on the segments 428 to identify which segment contains “IMEI” and returns the fifth segment 428-5 as containing “IMEI.” The barcode in the fifth segment 428-5 is then decoded to obtain and/or confirm the IMEI number.

Some may question why the barcode in the fifth segment 428-5 is decoded even though OCR of the fifth segment 428-5 can return a number for the IMEI. The barcode in the fifth segment 428-5 can be decoded for several reasons. For example, OCR usually has little or no error checks, whereas barcodes are usually designed to provide error analysis to confirm the barcode was correctly decoded. Accordingly, OCR of the IMEI number could return a “0” for a “6” or the letter “l” for the number “1”. Searching for a predefined string using OCR, or a string that is “close enough” (e.g., matched using a Levenshtein distance; or using common error substitutions, such as a number “1” for the letter “I”) or has the highest probability of a limited set to match the predefined string, and decoding the barcode to obtain the number, is usually more reliable (e.g., less prone to error) than simply relying on OCR to obtain the number.
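As an illustrative, non-limiting sketch, a “close enough” search for a predefined key string can be implemented with a Levenshtein distance computed over OCR tokens from each segment. The function names, the segment identifiers used as dictionary keys, the sample OCR output, and the distance threshold are assumptions for illustration only.

```python
from typing import Dict, Optional


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (standard dynamic-programming form)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def find_key_segment(ocr_by_segment: Dict[str, str], key: str,
                     max_distance: int = 1) -> Optional[str]:
    """Return the segment whose OCR text contains the token closest to `key`.

    Returns None when no token is within `max_distance` edits of the key.
    """
    best_id, best_dist = None, max_distance + 1
    for seg_id, text in ocr_by_segment.items():
        for token in text.upper().split():
            dist = levenshtein(token, key.upper())
            if dist < best_dist:
                best_id, best_dist = seg_id, dist
    return best_id


# Assumed OCR output per segment; "1MEI" simulates a misread of "IMEI".
segments = {"428-1": "Part No.", "428-2": "Serial No.", "428-5": "1MEI"}
imei_segment = find_key_segment(segments, "IMEI")
```

In this sketch, only the key string is matched approximately; the number itself would still be obtained by decoding the barcode in the returned segment.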

In some embodiments, two SKU barcodes are decoded and classified, a promotion is identified (e.g., by color of label), a type of promotion is identified (e.g., final clearance by a star on the label or the last two digits of a price being a certain number, such as “97”), and/or price is identified. Thus a label can be detected and recognized (e.g., the label is the anchor), and then multiple objects on the label are decoded (e.g., either by OCR or barcode algorithm, by searching for objects in well-defined spots, font size, font color, etc.).

In FIG. 4, a label 444 is shown. A symbol 448 is on the label 444. The symbol is of an hourglass. To read an expiration date on the label 444, the label 444 is scanned to search for the symbol 448. In some embodiments, the symbol 448 is not a barcode or character (e.g., not found on a standard computer keyboard). An image of the label 444 is then divided to obtain segment 452. A position of segment 452 in the image is configured so that the symbol 448 is in an upper, left-hand portion of segment 452. An OCR algorithm is then run in the segment 452 to obtain a date. For example, “2016-09-11” is obtained. The text “USE BY” in segment 452 can be used to confirm the segment is in the correct location of the image, and/or the text “USE BY” is simply ignored/discarded.

In some embodiments, semantics are ascertained by decoding a barcode and comparing barcode content with OCR data. In the example described above with label 424, the system is configured to associate text above a barcode with the barcode. However, in some embodiments, a system may not know which text is associated with which barcode. For example, it might be challenging for a system to ascertain whether the barcode above or the barcode below “Serial No.” on label 424 is associated with the serial number (e.g., since the system might not know if accompanying texts are situated above or below the barcodes, which is something that might vary for different label types). For example, the accompanying text to the barcode shown in the second segment 428-2 could be “Serial No.” or “CSN.” To ascertain which text is associated with a barcode, a barcode is decoded and a line of text (or multiple lines of text) is read (e.g., “Serial No. 12345678901234”). A first part of the text (e.g., “Serial No.”; ascertained by dividing the text into a first string of letters and a second string of numbers) is used to find out the semantics. A second part (e.g., “12345678901234”; the second string) is used to match the text to a barcode, since the content of the barcode should also be “12345678901234.” In some embodiments, the barcode is decoded, even though the numbers “12345678901234” can be obtained by optical character recognition, because some of the decoded text might include some wrongly read characters. Accordingly, matching the second part of the text to the barcode content is performed using an approximation. For example, a Levenshtein distance between the content of the barcode and the second part of the text could be used. In another example, the system counts a first number of digits in the content of the barcode, counts a second number of digits in the second part of the text, and compares the first number of digits to the second number of digits. Since “Part No.”, “Serial No.”, and “CSN” on label 424 have different numbers of digits, text can be efficiently associated with barcodes based on a number of digits in the text, or in a part of the text.
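As an illustrative, non-limiting sketch, the digit-count comparison described above can be implemented by splitting each OCR line into a leading label and a trailing digit string, and then linking the label to the only decoded barcode having the same number of digits. The function names, the regular expression, and the sample values are assumptions for illustration.

```python
import re
from typing import Dict, List, Optional, Tuple


def split_label_and_value(line: str) -> Optional[Tuple[str, str]]:
    """Split e.g. 'Serial No. 12345678901234' into ('Serial No.', '12345678901234')."""
    match = re.match(r"^\s*(\D*?)\s*(\d+)\s*$", line)
    if match is None:
        return None
    return match.group(1), match.group(2)


def count_digits(text: str) -> int:
    """Number of digit characters in a string."""
    return sum(ch.isdigit() for ch in text)


def match_by_digit_count(ocr_lines: List[str],
                         barcode_values: List[str]) -> Dict[str, str]:
    """Link each OCR label to the single barcode with the same digit count.

    Ambiguous cases (zero or several candidates) are left unlinked.
    """
    links = {}
    for line in ocr_lines:
        parsed = split_label_and_value(line)
        if parsed is None:
            continue
        label, digits = parsed
        candidates = [v for v in barcode_values
                      if count_digits(v) == len(digits)]
        if len(candidates) == 1:
            links[label] = candidates[0]
    return links


# The OCR line contains one misread digit ("8" for "3"); the barcode value is trusted.
ocr_lines = ["Serial No. 12345678901284", "Part No. 98765"]
barcodes = ["12345678901234", "98765"]
links = match_by_digit_count(ocr_lines, barcodes)
```

Because only the digit count is compared, the misread character in the OCR line does not prevent the link, and the reported value comes from the decoded barcode.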

C. Smarter Product Label Scanning

With capabilities in text recognition, text localization, label localization, and/or optical pattern scanning, more complex embodiments can be enabled, which can include both localizing and scanning specific text fields as well as optical patterns and also incorporate vertical specific data format standards (e.g., on medical device labels). In some embodiments, a system may scan data from multi barcode labels (e.g., electronic devices+packages, medical devices, or asset tracking labels).

FIG. 5 illustrates images 502 of exemplary labels 504 for scanning. Some platforms may scan multiple barcodes without knowing the specific characteristics to uniquely identify a semantic of the barcode (e.g., SKU vs serial number), serial number, lot number, batch number, manufacturing date, or expiry date, which may be printed as text but not encoded in an optical code. Often, information is limited to a single line or a single word. The system (e.g., system 100 in FIG. 1) may augment a) existing barcode scanners with the capability to scan multiple barcodes at once; b) existing OCR capabilities; and/or c) technological components that combine OCR+barcode scanning for scanning price labels in retail stores. In some embodiments, labels 504 may be scanned in conditions without a reliable internet connection (e.g., some embodiments run on a user's device and/or transmit data to a local device using Bluetooth or near field communication).

Applicant has observed that in certain situations, scanning a particular barcode can be challenging. For example, FIG. 5 shows a first image 502-1 and a second image 502-2. The first image 502-1 is of a first label 504-1. The second image 502-2 is of a second label 504-2. The first label 504-1 and the second label 504-2 each comprise several barcodes 508.

The first image 502-1 is of a phone box. When a phone is sold, the retail agent scans the barcode of a serial number of the phone for purchase. However, with multiple barcodes, some image-based barcode scanning software is not able to correctly identify which barcode is the particular barcode that relates to the serial number.

In some embodiments, to identify a particular barcode, the system decodes each barcode 508 in an image 502, or portion of an image (e.g., an image segment). For example, the first image 502-1 is divided to create a segment 516, which includes the first label 504-1. Each barcode within the first segment 516 is decoded to produce a plurality of scan results. The plurality of scan results are then compared to UPC codes to identify which UPC codes are used. The scan result that is a UPC code that matches a UPC code for a serial number is selected (e.g., based on symbology and/or length of the serial number). If the product is not known, the screen freezes and the user can select the serial number from undefined barcodes scanned.

However, if there is more than one barcode per type of barcode, then a scanning result can still be ambiguous using the method in the paragraph above. For example, a barcode for an International Mobile Equipment Identity (IMEI) can be a code 128 barcode. However, code 128 barcodes are used in many applications, including other IMEI barcodes (e.g., see also Wikipedia, Code 128, available at: https://en.wikipedia.org/wiki/Code_128). Accordingly, there can be other code 128 barcodes on a package.

To select a particular barcode from multiple of the same type (e.g., to select an IMEI number from multiple code 128 barcodes in an image), each barcode of the same type (e.g., of a particular type, such as code 128 barcodes) is tracked and scanned (e.g., using techniques as disclosed in commonly owned U.S. patent application Ser. No. 17/244,251, filed on Apr. 29, 2021, which is incorporated by reference for all purposes). Other optical patterns are ignored. The system determines the barcode that is to be scanned based on a position of the barcode (e.g., IMEI1 is above IMEI2) and not just barcode type (e.g., symbology and/or length). In some applications, some phones have multiple IMEI numbers (dual-SIM phones), and the user would like to automate scanning the correct one (based on the UPC code of the phone). In some configurations, an application allows a user to create capture patterns which are bound to specific UPC/EAN barcode values. Based on the UPC/EAN code, it is decided which of the multiple IMEI barcodes (Code 128) is used to fill the IMEI1 field.

In some embodiments, an implementation of a mobile-device app comprises one or more of: scanning each barcode in an image; identifying UPC/EAN codes; if the UPC/EAN is identified, scanning the relevant IMEI and displaying the result, with or without displaying the serial number of the device as well; if the UPC/EAN is not identified and only one IMEI code is found, then the app requests the user to confirm that the device is a single-IMEI device and saves this information accordingly; if the UPC/EAN is not identified and multiple IMEIs are found, then the app can freeze an image on the screen and/or request the user to select an IMEI; and saving the IMEI (e.g., by sorting both IMEI+UPC on X and Y axes and saving the index of the selected barcode).
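As an illustrative, non-limiting sketch, the decision flow above can be written as a function that orders IMEI candidates by position and returns an action for the app to take. The tuple layout, the action names, the `known_patterns` mapping (a stand-in for capture patterns bound to UPC/EAN values), the sort order, and the sample values are assumptions for illustration.

```python
from typing import Dict, List, Optional, Tuple

# (decoded value, x, y) with y increasing downward in image coordinates (assumed).
Candidate = Tuple[str, int, int]


def choose_imei(upc: Optional[str],
                imei_candidates: List[Candidate],
                known_patterns: Dict[str, int]) -> Tuple[str, Optional[str]]:
    """Return (action, imei) following the flow described above.

    known_patterns maps a UPC/EAN value to the saved index of the IMEI barcode
    after sorting candidates top-to-bottom, then left-to-right.
    """
    ordered = sorted(imei_candidates, key=lambda c: (c[2], c[1]))
    if upc in known_patterns and known_patterns[upc] < len(ordered):
        # UPC/EAN identified: pick the IMEI at the saved position.
        return "auto", ordered[known_patterns[upc]][0]
    if len(ordered) == 1:
        # Unknown UPC/EAN, single IMEI: ask the user to confirm single-IMEI device.
        return "confirm_single_imei", ordered[0][0]
    if len(ordered) > 1:
        # Unknown UPC/EAN, several IMEIs: freeze the frame and let the user select.
        return "ask_user", None
    return "no_imei", None


candidates = [("356938035643809", 10, 200), ("490154203237518", 10, 100)]
patterns = {"610214630421": 0}
action, imei = choose_imei("610214630421", candidates, patterns)
```

After a user selection, the app could store the index of the selected candidate in `known_patterns` so that later scans of the same UPC/EAN resolve automatically.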

In some embodiments, a barcode semantics solution is implemented. A barcode semantics feature label capture can be used to automatically scan the correct barcode without having to maintain templates (e.g., relations between barcodes). In some configurations, semantic type is looking for a specific type of character. For example, optical character recognition is run on fields near barcodes to identify the characters (e.g., identifying “IMEI” and matching a barcode to the characters).

It is possible to define a label capture template that specifies required and optional fields (e.g., UPC, IMEI1, and IMEI2) and how they can be identified. For example, a label contains four code 128 barcodes with different semantics: an EID code, an IMEI2 code, a serial number, and an IMEI/IMEID code. The EID code and the serial number could be required, and the IMEI2 code and the IMEI/IMEID code could be optional. In some embodiments, a global barcode semantics field is not used and a name of a barcode is derived based on recognizing text near a barcode.

In some embodiments, an implementation of a mobile-device app using barcode semantics comprises: scan all barcodes, pick UPC/EAN; if UPC/EAN is known, scan relevant IMEI and display result (no serial number shown); if UPC/EAN is unknown, use barcode semantics to identify IMEI 1 (and IMEI 2 to exclude it); save selected IMEI (e.g., by sorting both IMEI+UPC on X and Y axis and save index of selected barcode). In some embodiments, a user is requested to confirm that the right barcode has been identified (e.g., by displaying an icon over the barcode on a display of a mobile device).

Semantic scanning can be used where there is a barcode and a letter combination associated with the barcode. For example, the letter combination could be “serial”, “ISBN”, “IMF”, “sim”, etc. The letter combination tells the system what the barcode is. This can be used to create a template and/or to classify a barcode.

As an example, the first image 502-1 is acquired by a camera of a mobile device. The image is cropped to the segment 516. Barcodes 508 in the segment are detected and the segment 516 is further subdivided into secondary segments, with one secondary segment per barcode 508. For example, segments 428 in FIG. 4 could be secondary segments. OCR is run on the secondary segments to obtain barcode identifiers. One or more barcodes are decoded and matched (e.g., correlated) with a barcode identifier (e.g., based on proximity of text to a barcode). Table 1 below provides a sample matching between barcode identifiers, obtained using OCR, and strings obtained by decoding barcodes 508 in the first image 502-1.

TABLE 1
Matching OCR fields with barcodes

Barcode identifiers (OCR)    Strings (from barcodes)
SKU                          610214630421
Sim Serial                   8004360561510603858
IMEI                         353914054176312
TGT                          170-7663

Barcode identifiers are obtained by optical character recognition. Strings are obtained by decoding a barcode. Barcode identifiers are associated (e.g., linked) with strings based on proximity. This technique can work particularly well for highly structured labels (e.g., where the barcode identifier is known in relation to the barcode). Dividing the image to define segments is sometimes referred to as segmentation or text localization.
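As an illustrative, non-limiting sketch, linking by proximity can be implemented by pairing each decoded barcode with the nearest OCR text field located above it. The coordinate convention (y increasing downward), the tuple layouts, the assumption that identifiers sit above barcodes, and the sample positions are all assumptions for illustration; other label layouts would use a different positional rule.

```python
import math
from typing import Dict, List, Tuple

TextField = Tuple[str, float, float]  # (OCR text, center x, center y)
Barcode = Tuple[str, float, float]    # (decoded value, center x, center y)


def link_identifiers(text_fields: List[TextField],
                     barcodes: List[Barcode]) -> Dict[str, str]:
    """Link each barcode to the closest text field above it (smaller y)."""
    links = {}
    for value, bx, by in barcodes:
        above = [(math.hypot(tx - bx, ty - by), text)
                 for text, tx, ty in text_fields if ty < by]
        if above:
            _, nearest_text = min(above)
            links[nearest_text] = value
    return links


# Assumed positions; the values are taken from Table 1.
fields = [("SKU", 50, 10), ("IMEI", 50, 110)]
codes = [("610214630421", 50, 40), ("353914054176312", 50, 140)]
table = link_identifiers(fields, codes)
```

Restricting candidates to one side of the barcode reflects the observation that the closest text string is not always the correct key.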

In another example, OCR can be used twice, first to locate identifiers, and then second to decode text. For example, FIG. 5 shows a third label 502-3. The third label 502-3 includes three identifiers 520, “Lot”, “Date”, and “Pret”; and the third label 502-3 includes three values 524, “3445” corresponding to Lot, “24.06.2019” corresponding to Date, and “12.56” corresponding to Pret. A barcode is detected. A first segmentation of the image of the third label 502-3 to the right of the barcode is made because it is known that identifiers are to the right of the barcode. A first OCR algorithm is run to locate the identifiers 520. The image is then segmented a second time based on locations of the three identifiers (e.g., further segmented into three secondary segments; one to the right of Lot, one to the right of Date, and one to the right of Pret). A second OCR is run on each of the secondary segments to generate strings that are associated with the identifiers, as given in Table 2 below.

TABLE 2
Matching OCR fields with other text

Identifiers (OCR)    Strings (from text)
Lot                  3445
Date                 24062019
Pret                 12.56
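As an illustrative, non-limiting sketch, the second segmentation can be computed from the bounding box of each located identifier by taking the region to its right, extending to the image edge. The `Box` type, the padding value, and the sample coordinates are assumptions for illustration; the OCR passes themselves are not shown.

```python
from typing import Dict, NamedTuple


class Box(NamedTuple):
    x: int  # left edge, pixels
    y: int  # top edge, pixels
    w: int  # width
    h: int  # height


def value_segment(identifier: Box, image_width: int, pad: int = 2) -> Box:
    """Return the secondary segment to the right of an identifier box."""
    x = identifier.x + identifier.w
    top = max(0, identifier.y - pad)
    return Box(x, top, image_width - x, identifier.h + 2 * pad)


# Assumed identifier locations from the first OCR pass on a 400-pixel-wide image.
identifiers: Dict[str, Box] = {
    "Lot": Box(100, 20, 40, 12),
    "Date": Box(100, 40, 48, 12),
    "Pret": Box(100, 60, 44, 12),
}
secondary = {name: value_segment(box, 400) for name, box in identifiers.items()}
```

Each secondary segment would then be passed to the second OCR pass, and the resulting string associated with the identifier that produced the segment.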

In some embodiments, a method for decoding information on a label comprises: acquiring an image of the label wherein the label contains alphanumeric characters and one or more barcodes; decoding the one or more barcodes; performing optical character recognition on the alphanumeric characters of the label; analyzing positions of the alphanumeric characters relative to the one or more barcodes; correlating the alphanumeric characters with the one or more barcodes (e.g., as shown in Table 1 above); and reporting the alphanumeric characters correlated with the one or more barcodes.

FIG. 6 illustrates a flowchart of an embodiment of a process 600 for image analysis using text classification for optical character recognition. In some embodiments, text is classified before running optical character recognition. In some configurations, an image is segmented before classification and/or during classification. Classification can be based on features of text (e.g., largest text within a segment is classified as price). In some embodiments, machine learning is used for classification (e.g., to identify price labels).

Process 600 begins in step 604 with receiving an image comprising a first set of characters and a second set of characters (e.g., price 410 and description 408 in FIG. 4). The image is acquired by a camera (e.g., from a camera of a mobile device).

In step 608, at least a portion of the image (e.g., one or more image segments or an entire image) is analyzed to classify the first set of characters as belonging to a specified class of text, wherein the specified class is predefined (e.g., serial number, price, description, SKU, model number, etc.).

In step 612, the first set of characters are decoded and the second set of characters are not decoded, based on the first set of characters classified as belonging to the specified class and the second set of characters not being classified as belonging to the specified class. In some embodiments, the second set of characters are decoded based on the first set of characters classified as belonging to the specified class (e.g., values 524 are decoded based on identifiers 520 being located and classified).

A character string is generated based on decoding the first set of characters. The character string can then be saved and/or transmitted (e.g., to a local device or a remote server).
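As an illustrative, non-limiting sketch, steps 608 and 612 can be expressed as classifying text regions first and running OCR only on regions in the specified class. Here the class is price and the rule is “largest text height”; the dictionary layout for regions and the injected `ocr` callable are assumptions for illustration and stand in for whatever detector and OCR engine an embodiment uses.

```python
from typing import Callable, Dict, List


def is_price(region: Dict, all_regions: List[Dict]) -> bool:
    """Classify a region as price when it has the largest text height."""
    return region["height"] == max(r["height"] for r in all_regions)


def decode_specified_class(regions: List[Dict],
                           ocr: Callable[[object], str],
                           classify: Callable[[Dict, List[Dict]], bool]) -> List[str]:
    """Run OCR only on regions that the classifier assigns to the specified class."""
    strings = []
    for region in regions:
        if classify(region, regions):
            strings.append(ocr(region["crop"]))
    return strings


# Stand-in OCR that records which crops it was asked to decode.
decoded_crops = []


def fake_ocr(crop):
    decoded_crops.append(crop)
    return str(crop)


regions = [
    {"height": 12, "crop": "description pixels"},  # description 408 (not decoded)
    {"height": 40, "crop": "12.56"},               # price 410 (decoded)
]
price_strings = decode_specified_class(regions, fake_ocr, is_price)
```

Skipping OCR on regions outside the specified class is what reduces the amount of image data that has to be decoded.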

In some embodiments, the process 600 further comprises analyzing the image to detect a feature, wherein detecting the feature is based on machine learning; dividing the image to create an image segment, where the image segment is based on a location of the feature in the image; the portion of the image is the image segment; classifying text based on features of text (e.g., largest size within a segment); using machine learning for classification (e.g., identify price labels); and/or analyzing the second set of characters to classify the second set of characters as belonging to a second specified class of text, wherein the second specified class is predefined.

In some configurations, classification includes price (e.g., the largest text is picked), description, and/or serial number. A robot can be used to acquire images (e.g., of a retail shelf); do label detection (e.g., from a trained AI); segment the image into labels; classify product description and price (e.g., to the right, large font); identify the label as a promotion label (e.g., by color of label; on bottom of label, text specifying “clearance” or “special”); identify a date the promotion is to expire, the promotion date; compare the promotion date to the current date; and/or flag promotion dates that are expired. This can be used to flag labels that need to be replaced.
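As an illustrative, non-limiting sketch, comparing a promotion date obtained by OCR to the current date can be done with standard date parsing. The function name and the date format strings are assumptions; a label could print dates in other formats.

```python
from datetime import date, datetime


def is_promotion_expired(ocr_date_text: str, today: date,
                         fmt: str = "%Y-%m-%d") -> bool:
    """Return True when the promotion date read from the label is before `today`.

    Raises ValueError if the OCR text does not match the assumed format.
    """
    promotion_date = datetime.strptime(ocr_date_text.strip(), fmt).date()
    return promotion_date < today


# Flag a label whose promotion ended the day before the check.
needs_replacement = is_promotion_expired("2016-09-11", date(2016, 9, 12))
```

A day-month-year label could be handled by passing a different format, for example `fmt="%d.%m.%Y"` for text such as “24.06.2019”.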

D. Shipping Label Scanning

FIG. 7 illustrates exemplary shipping labels in multiple structures and information formats. In some embodiments, a shipping label could be scanned and decoded without manual configuration of the label geometry, and may permit data to be scanned from shipping labels beyond what is encoded in standardized optical patterns (e.g., GS1 barcodes). Data may be captured from labels on inbound parcels or other deliveries where electronic data interchange (EDI) information is missing or doesn't cover a logistic provider. With typical OCR and optical pattern scanning technology, scanning may not be feasible due to variability in shipping labels between different logistic providers, incomplete syntax, and structure information for unrecognized labels. For example, some data may be provided only in alphanumeric form (e.g., mailing addresses), making it more important to read some text using OCR. In some embodiments, a system may include: (i) a barcode scanner with the capability to scan multiple barcodes at once; (ii) OCR capabilities; and/or (iii) technological components that combine OCR and barcode scanning. As shipping labels are often scanned in conditions without reliable internet, a system may implement label scanning on a device, rather than as a cloud application.

In FIG. 7, label 704 comprises a first barcode 708-1, a second barcode 708-2, a third barcode 708-3, and a character code 712. In some embodiments, the system detects the barcodes 708 on label 704 and selects the largest barcode to decode (e.g., the third barcode 708-3); the other barcodes (the first barcode 708-1 and the second barcode 708-2) are not decoded. In some configurations, the barcodes 708 are detected and categorized (e.g., one-dimensional or two-dimensional; or type, such as code 128), and then one or more barcodes 708 are decoded (e.g., the two-dimensional barcode, the largest one-dimensional barcode, or the smallest one-dimensional barcode).
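As an illustrative, non-limiting sketch, selecting which detected barcode to decode can be expressed as a rule applied to detection metadata before any decoding takes place. The dictionary keys, the rule names, and the sample sizes are assumptions for illustration.

```python
from typing import Dict, List, Optional


def select_barcode(detections: List[Dict], rule: str = "largest") -> Optional[Dict]:
    """Pick one detected barcode to decode, based on size and category.

    Each detection is a dict with 'id', 'dimensions' (1 or 2), 'width', 'height'.
    """
    def area(d: Dict) -> int:
        return d["width"] * d["height"]

    if rule == "largest":
        pool = detections
        pick = max
    elif rule == "two_dimensional":
        pool = [d for d in detections if d["dimensions"] == 2]
        pick = max
    elif rule == "largest_1d":
        pool = [d for d in detections if d["dimensions"] == 1]
        pick = max
    elif rule == "smallest_1d":
        pool = [d for d in detections if d["dimensions"] == 1]
        pick = min
    else:
        raise ValueError(f"unknown rule: {rule}")
    return pick(pool, key=area) if pool else None


# Assumed sizes for the barcodes 708 on label 704.
detected = [
    {"id": "708-1", "dimensions": 2, "width": 80, "height": 80},
    {"id": "708-2", "dimensions": 1, "width": 150, "height": 30},
    {"id": "708-3", "dimensions": 1, "width": 300, "height": 90},
]
to_decode = select_barcode(detected, "largest")
```

Only the returned detection would be passed to a decoder; the remaining barcodes are left undecoded.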

In some embodiments, a first code is used as an anchor for a second code. For example, the first barcode 708-1 could serve as an anchor to decode another code. For example, detecting the location of a two-dimensional barcode (the first barcode 708-1) could be used to divide an image of the label 704 to define segment 716. An OCR algorithm and/or barcode decoding algorithm are then run on the segment 716. Thus the first barcode 708-1 can be used as an anchor for the second barcode 708-2 or the character code 712. In some configurations, two codes are used as an anchor for a third. For example, detecting the location of the first barcode 708-1 and the second barcode 708-2, and identifying the first barcode 708-1 as a two-dimensional barcode and the second barcode 708-2 as a one-dimensional barcode, can provide position and/or size information to define segment 720. In some configurations, an outline of the label 704 is also used to provide position data for segmenting an image. Using geometry of the label (e.g., the label outline and/or detected features on the label, such as barcodes, symbols, and characters) can provide information about how to segment an image. An OCR algorithm and/or a barcode decoding algorithm can then be run on the segment. Segments can help to reduce computational complexity, and segments can also be used while classifying and/or matching decoded strings to labels.

In some configurations, an address is localized (e.g., using an anchor). For example, the receiver address (“SHIP TO” on label 704) is localized based on being above the first barcode 708-1. The sender address can also be localized and/or differentiated from the receiver address.

E. Other OCR Capabilities

FIG. 8 illustrates cases where a system may be configured for use-case specific scanning combining OCR and optical pattern scanning. Use-case specific OCR may improve the following use cases of OCR: (i) REF/LOT/Expiry marks as part of medical device labels (UDI) in healthcare (as an application of smarter label scanning); (ii) Identify REF/LOT Fields as specified in UDI and parse barcodes present; (iii) Serial or Batch Number scanning (standalone or as part of Smarter Product Label Scanning); (iv) Retail: Expiry date scanning (with or without combination of barcode scanning); (v) VIN Number Scanning; and (vi) Credit Card Scanning.

In some embodiments, Credit Card Scanning includes scanning data on the front of credit and debit cards for faster data entry. Credit card scanning may be applied to business to client payment applications (ecommerce/mobile shopping or self-scanning with payment). In some embodiments, it may be used for business to enterprise payment in physical retail, especially for queue busting (describing a retail approach of scanning items while customers are waiting in line to improve the retail experience). A goal is to speed up data entry, potentially including an image of the card where required.

In some embodiments, a system may allow for entry of credit card data (native or web) through an automatic scan of the card. The system may collect data in two ways: (i) by using a user interface (UI) provided by the payment provider, some of which directly send the payment request to the payment processor; and/or (ii) by returning the data to the app itself, which would send the collected data in a second step to the payment processor. Some of these UIs also include card scanning with the camera. Solutions where the payment details are passed to the app before transmission to the payment processor can be extended with a card scanning solution as an alternative route before sending the data to the payment processor.

In some embodiments, facial recognition is used as an anchor for OCR. For example, in image 802, facial recognition is used to identify a face position 804. Size and/or locations of segments 808 are based on the face position 804.

F. Flexible Pipeline

FIG. 9 illustrates various approaches to integration of OCR-optical pattern scanning in accordance with various embodiments. In some embodiments, a keyboard wedge scan engine is used to interface with fillable form documents and/or database files to enter credit card information. Credit card scanning may include, but is not limited to, applications in the following areas: (i) B2C, where credit card scanning could be a complementary offering to existing barcode offerings; (ii) B2E, e.g., Airlines: credit card scanning helps lower costs for B2E scanning and allows them to replace the existing additional hardware; and/or (iii) retail, potential opportunities for multiple points of sale or queue busting.

FIG. 10 illustrates a flowchart of an embodiment of a process 1000 for image analysis using a flexible pipeline. A flexible pipeline can be used to create an engine to customize what is detected and/or classified in an image.

The process 1000 begins in step 1004 with training a first engine to identify a first class of text. A second engine is trained to classify a second class of text, step 1008. A first user is provided the first engine for detecting the first class of text, step 1012. A second user is provided the second engine for detecting the second class of text, step 1016.

G. Rail Reporting Marks

FIG. 11 illustrates exemplary reporting marks on rail stock. Rolling stock (e.g., a railcar) in the United States is marked with unique identifiers called reporting marks. Reporting marks may include an alphabetic code of one to four letters and are typically painted or stenciled on each piece of rolling stock, along with a one- to six-digit number. Reporting marks uniquely identify an individual railcar, and are used to track the locations and movements by the railroad the railcars are traveling on. The railroad shares the information with other railroads and customers and/or various types of maintenance or repair processes. Reporting marks are typically printed in a large font on the side of the railcar and can usually be easily read from multiple meters away.

Reporting mark numbers can be recorded by manual entry, which is cumbersome and error prone. However, scanning reporting marks poses significant challenges for automation. For example, fonts vary between different railcars (e.g., compare “BNSF” in a first image 1104-1 to “BNSF” in a second image 1104-2). Rolling stock may be mixed in a single train. Furthermore, the orientation and layout of the reporting marks may vary, where some railway companies have the reporting marks on a single line, while others print them on two separate lines. Furthermore, as many as hundreds of different reporting marks are in use in the United States. Graffiti can be painted on a railcar, sometimes hiding the name and/or numbers partially. Font changes between the characters and welding lines can be additional challenges.
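As an illustrative, non-limiting sketch, OCR output for a reporting mark can be validated against the format described above (one to four letters followed by a one- to six-digit number), whether the mark was printed on one line or two. The function name, the regular expression, and the sample car numbers are assumptions for illustration.

```python
import re
from typing import Optional, Tuple

# One to four letters, optional whitespace, then one to six digits.
MARK_RE = re.compile(r"^([A-Z]{1,4})\s*(\d{1,6})$")


def parse_reporting_mark(ocr_text: str) -> Optional[Tuple[str, str]]:
    """Return (letters, number) if the OCR text looks like a reporting mark."""
    # Collapse line breaks and repeated spaces so two-line marks match too.
    normalized = " ".join(ocr_text.upper().split())
    match = MARK_RE.match(normalized)
    if match is None:
        return None
    return match.group(1), match.group(2)


two_line = parse_reporting_mark("BNSF\n123456")
single_line = parse_reporting_mark("bnsf 4512")
```

Text that does not fit the format (for example, graffiti or a partially hidden mark) yields `None`, which a system could use to trigger another capture.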

In some embodiments, a system permits a user to scan marks on a railcar from a reasonable distance (e.g., 3-10 meters) and automatically recognize the train reporting mark without manual data entry. A system may recognize the text on a partial area of the image (the “aimer”), which is not more than 200% of the size of the text in some embodiments. This partial area can contain both two-line marks as well as a longer single-line mark. In some embodiments, the railway mark can be considered fixed to 4 letters. In some embodiments, strong angles are supported, where strong angles may describe a relative perspective such that the text is significantly skewed. In this way, a system may recognize as many as 80-90 percent or more of the reporting marks encountered by the user. For example, the system is capable of recognizing as many as 95% or more of the human readable markings (e.g., using techniques explained and/or referenced herein).

In some embodiments, a system is configured to identify the markings in a full uncropped video frame (“localization”) to allow automatic scanning. The capability could be implemented on an iOS or Android app or an SDK. In some embodiments, this permits a user to record rolling stock in a moving train, as opposed to manually registering every railcar at every stop, delivery, and/or safety check. As an illustrative example, manual safety checks or other registrations of a tanker that include manual data entry may take place as many as eight times a day.

FIG. 12 illustrates other numbering systems on rolling stock and shipping containers. In such systems, markings may be more complicated and/or smaller, and a system may be configured to scan markings and identify data types and data values. For example, a system may scan markings to match a letter identifier printed on the car to the identification number of an IoT device that is being attached to the rail car. In other words, the scanning is used to associate two identification numbers associated with the car to commission the IoT device. In some embodiments, an IoT device is an RFID tag, a GPS tracker, some form of RF device, or an optical identifier. In another example, the shipping container numbering system standard BIC (international container bureau) may be scanned by a system implementing optical pattern scanning with OCR.

FIG. 13 illustrates a flowchart of an embodiment of a process 1300 for image analysis using visual geometry as an anchor for optical character recognition. Visual geometry in an image can be used as an anchor point for selecting an area of the image (e.g., one or more segments of the image).

The process 1300 begins in step 1304 with detecting a location within an image having a specified geometry. The image, acquired by a camera, is received by the system. The image is analyzed to detect a location within the image having a specified geometry. The specified geometry is a predefined, visual geometry. For example, the specified geometry could be a one-dimensional or two-dimensional barcode (e.g., an arrangement of rectangles, such as squares or parallel lines), a specified class of barcodes, or a label.

In step 1308, the image is divided to create an image segment. The image segment is based on the location of the specified geometry. For example, the image segment could be based on geometry of label 704 and/or a position of a barcode 708 as described in FIG. 7.

In step 1312, one or more characters within the image segment are detected and/or decoded. The image segment is analyzed to detect the one or more characters within the image segment, and the one or more characters in the image segment are decoded. For example, an optical character recognition algorithm is run on the image segment and not on the entire image.

A character string is generated based on decoding the one or more characters in the image segment, step 1316.
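Steps 1304 through 1316 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the image is modeled as a row-major list of pixel rows, the geometry detector and the OCR routine are supplied by the caller as hypothetical callables, and the segment is assumed to lie directly above the detected geometry (e.g., text printed above a barcode). None of these names or choices come from the disclosure.

```python
from dataclasses import dataclass
from typing import Callable, List

Image = List[List[int]]  # grayscale pixels, row-major


@dataclass(frozen=True)
class Box:
    left: int
    top: int
    width: int
    height: int


def segment_above(anchor: Box, image: Image, rows: int) -> Box:
    """Choose a segment up to `rows` pixel rows tall, directly above the anchor."""
    top = max(0, anchor.top - rows)
    left = max(0, anchor.left)
    right = min(len(image[0]), anchor.left + anchor.width)
    return Box(left, top, right - left, anchor.top - top)


def crop(image: Image, box: Box) -> Image:
    """Copy the pixels inside the box out of the image."""
    return [row[box.left:box.left + box.width]
            for row in image[box.top:box.top + box.height]]


def run_anchor_ocr(image: Image,
                   detect_geometry: Callable[[Image], Box],
                   ocr: Callable[[Image], List[str]],
                   rows: int = 20) -> str:
    anchor = detect_geometry(image)                            # step 1304
    segment = crop(image, segment_above(anchor, image, rows))  # step 1308
    characters = ocr(segment)                                  # step 1312
    return "".join(characters)                                 # step 1316
```

Because the OCR callable only ever receives the cropped segment, it is not run on the entire image, which is the point of using the geometry as an anchor.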

In some embodiments, the specified geometry is a symbol (e.g., symbol 448 in FIG. 4). In some configurations, the specified geometry is multiple lines of text extending a specified distance. For example, there are two lines of characters extending across a bottom of a passport. The geometry of those lines could be detected, an image of the passport segmented based on a location of the two lines of characters, and OCR run on an image segment. In some configurations, the geometry is a label (e.g., boundaries of a label identified using edge detection). In some embodiments, the geometry is the largest or smallest font or barcode.

FIG. 14 illustrates a flowchart of an embodiment of a process 1400 for image analysis using machine-learning feature detection for optical character recognition. Machine learning can identify a feature and/or a segment to OCR, and then optical character recognition can be run on the segment.

The process 1400 begins in step 1404 with detecting a feature in an image using machine learning. The image, acquired by a camera, is received by the system. The image is analyzed to detect a feature in the image, wherein detecting the feature is based on machine learning. For example, the feature could be a one-dimensional or two-dimensional barcode (e.g., an arrangement of rectangles, such as squares or parallel lines), a specified class of barcodes, or a label. In step 1408, the image is divided to create an image segment. The image segment is based on the location of the feature in the image. For example, machine learning is used to detect locations of barcodes 708 in FIG. 7.

In step 1412, one or more characters within the image segment are detected and/or decoded. The image segment is analyzed to detect the one or more characters within the image segment, and the one or more characters in the image segment are decoded. For example, an optical character recognition algorithm is run on the image segment and not on the entire image.

A character string is generated based on decoding the one or more characters in the image segment, step 1416.

FIG. 15 illustrates a flowchart of an embodiment of a process 1500 for image analysis using feature detection for optical character recognition. Feature detection can be implemented by detecting a predefined, visual geometry (e.g., by feature extraction) or by using machine learning.

Process 1500 begins in step 1500 with detecting a feature in an image. Examples of a feature include, but are not limited to, a barcode, a symbol (e.g., symbol 448 in FIG. 4), an edge of a label, lines of text, and a face. The feature can be detected using geometric cues and/or using machine learning. The image, acquired by a camera, is received by the system. The image is analyzed to detect a location of a specified feature within the image. The specified feature is a visual feature. For example, the specified feature could be a one-dimensional or two-dimensional barcode (e.g., an arrangement of rectangles, such as squares or parallel lines), a specified class of barcodes (e.g., numeric only, alpha-numeric, 2-dimensional, etc.), type of barcode (e.g., EAN 8, EAN 13, code 39, code 128, QR code, Aztec code, etc.), or a label.

In step 1408, the image is divided to create an image segment. The image segment is based on the location of the specified feature within the image. In step 1412, one or more characters within the image segment are detected and/or decoded. The image segment is analyzed to detect the one or more characters within the image segment, and the one or more characters in the image segment are decoded. For example, an optical character recognition algorithm is run on the image segment and not on the entire image. A character string is generated based on decoding the one or more characters in the image segment, step 1416.
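Because the feature in process 1500 can be found using geometric cues and/or machine learning, one possible arrangement is to treat each detection technique as an interchangeable callable that reports feature locations. The following Python sketch illustrates that idea. The type aliases, the function names, and the choice to use each feature's own bounding box as the image segment (e.g., a label boundary) are assumptions for illustration only.

```python
from typing import Callable, Iterable, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive
Image = List[List[int]]          # row-major pixel values
Detector = Callable[[Image], Iterable[Box]]


def detect_features(image: Image, detectors: Iterable[Detector]) -> List[Box]:
    """Collect feature locations from every configured detector, dropping duplicates."""
    seen: List[Box] = []
    for detector in detectors:
        for box in detector(image):
            if box not in seen:
                seen.append(box)
    return seen


def decode_near_features(image: Image,
                         detectors: Iterable[Detector],
                         ocr: Callable[[Image], str]) -> List[str]:
    """Segment the image at each detected feature and decode only those segments."""
    strings = []
    for left, top, right, bottom in detect_features(image, detectors):
        segment = [row[left:right] for row in image[top:bottom]]
        strings.append(ocr(segment))
    return strings
```

A caller could pass a geometry-based detector, a learned detector, or both in the `detectors` list; the segmentation and decoding code is the same in each case.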

FIG. 16 is a simplified block diagram of a computing device 1600. Computing device 1600 can implement some or all functions, behaviors, and/or capabilities described above that would use electronic storage or processing, as well as other functions, behaviors, or capabilities not expressly described. Computing device 1600 includes a processing subsystem 1602, a storage subsystem 1604, a user interface 1606, and/or a communication interface 1608. Computing device 1600 can also include other components (not explicitly shown) such as a battery, power controllers, and other components operable to provide various enhanced capabilities. In various embodiments, computing device 1600 can be implemented in a desktop or laptop computer, mobile device (e.g., tablet computer, smart phone, mobile phone), wearable device, media device, application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), processors, controllers, micro-controllers, microprocessors, or electronic units designed to perform a function or combination of functions described above.

Storage subsystem 1604 can be implemented using a local storage and/or removable storage medium, e.g., using disk, flash memory (e.g., secure digital card, universal serial bus flash drive), or any other non-transitory storage medium, or a combination of media, and can include volatile and/or non-volatile storage media. Local storage can include random access memory (RAM), including dynamic RAM (DRAM), static RAM (SRAM), or battery backed up RAM. In some embodiments, storage subsystem 1604 can store one or more applications and/or operating system programs to be executed by processing subsystem 1602, including programs to implement some or all operations described above that would be performed using a computer. For example, storage subsystem 1604 can store one or more code modules 1610 for implementing one or more method steps described above.

A firmware and/or software implementation may be implemented with modules (e.g., procedures, functions, and so on). A machine-readable medium tangibly embodying instructions may be used in implementing methodologies described herein. Code modules 1610 (e.g., instructions stored in memory) may be implemented within a processor or external to the processor. As used herein, the term “memory” refers to a type of long term, short term, volatile, nonvolatile, or other storage medium and is not to be limited to any particular type of memory or number of memories or type of media upon which memory is stored.

Moreover, the term “storage medium” or “storage device” may represent one or more memories for storing data, including read only memory (ROM), RAM, magnetic RAM, core memory, magnetic disk storage mediums, optical storage mediums, flash memory devices and/or other machine readable mediums for storing information. The term “machine-readable medium” includes, but is not limited to, portable or fixed storage devices, optical storage devices, wireless channels, and/or various other storage mediums capable of storing instruction(s) and/or data.

Furthermore, embodiments may be implemented by hardware, software, scripting languages, firmware, middleware, microcode, hardware description languages, and/or any combination thereof. When implemented in software, firmware, middleware, scripting language, and/or microcode, program code or code segments to perform tasks may be stored in a machine readable medium such as a storage medium. A code segment (e.g., code module 1610) or machine-executable instruction may represent a procedure, a function, a subprogram, a program, a routine, a subroutine, a module, a software package, a script, a class, or a combination of instructions, data structures, and/or program statements. A code segment may be coupled to another code segment or a hardware circuit by passing and/or receiving information, data, arguments, parameters, and/or memory contents. Information, arguments, parameters, data, etc. may be passed, forwarded, or transmitted by suitable means including memory sharing, message passing, token passing, network transmission, etc.

Implementation of the techniques, blocks, steps and means described above may be done in various ways. For example, these techniques, blocks, steps and means may be implemented in hardware, software, or a combination thereof. For a hardware implementation, the processing units may be implemented within one or more ASICs, DSPs, DSPDs, PLDs, FPGAs, processors, controllers, micro-controllers, microprocessors, other electronic units designed to perform the functions described above, and/or a combination thereof.

Each code module 1610 may comprise sets of instructions (codes) embodied on a computer-readable medium that directs a processor of a computing device 1600 to perform corresponding actions. The instructions may be configured to run in sequential order, in parallel (such as under different processing threads), or in a combination thereof. After loading a code module 1610 on a general purpose computer system, the general purpose computer is transformed into a special purpose computer system.
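As one illustration of instructions running in parallel under different processing threads, several image segments could be decoded concurrently. The sketch below uses the Python standard library thread pool; the function name, the `ocr` callable, and the default worker count are assumptions and do not describe any particular code module 1610.

```python
from concurrent.futures import ThreadPoolExecutor


def decode_segments(segments, ocr, max_workers=4):
    """Decode each segment on a worker thread; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order the segments were given,
        # regardless of which thread finishes first.
        return list(pool.map(ocr, segments))
```

Replacing the pool with a plain loop over `segments` gives the equivalent sequential ordering, so a caller can switch between the two without changing how the results are consumed.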

Computer programs incorporating various features described herein (e.g., in one or more code modules 1610) may be encoded and stored on various computer readable storage media. Computer readable media encoded with the program code may be packaged with a compatible electronic device, or the program code may be provided separately from electronic devices (e.g., via Internet download or as a separately packaged computer-readable storage medium). Storage subsystem 1604 can also store information useful for establishing network connections using the communication interface 1608.

User interface 1606 can include input devices (e.g., touch pad, touch screen, scroll wheel, click wheel, dial, button, switch, keypad, microphone, etc.), as well as output devices (e.g., video screen, indicator lights, speakers, headphone jacks, virtual- or augmented-reality display, etc.), together with supporting electronics (e.g., digital-to-analog or analog-to-digital converters, signal processors, etc.). A user can operate input devices of user interface 1606 to invoke the functionality of computing device 1600 and can view and/or hear output from computing device 1600 via output devices of user interface 1606. For some embodiments, the user interface 1606 might not be present (e.g., for a process using an ASIC).

Processing subsystem 1602 can be implemented as one or more processors (e.g., integrated circuits, one or more single-core or multi-core microprocessors, microcontrollers, central processing unit, graphics processing unit, etc.). In operation, processing subsystem 1602 can control the operation of computing device 1600. In some embodiments, processing subsystem 1602 can execute a variety of programs in response to program code and can maintain multiple concurrently executing programs or processes. At a given time, some or all of a program code to be executed can reside in processing subsystem 1602 and/or in storage media, such as storage subsystem 1604. Through programming, processing subsystem 1602 can provide various functionality for computing device 1600. Processing subsystem 1602 can also execute other programs to control other functions of computing device 1600, including programs that may be stored in storage subsystem 1604.

Communication interface 1608 can provide voice and/or data communication capability for computing device 1600. In some embodiments, communication interface 1608 can include radio frequency (RF) transceiver components for accessing wireless data networks (e.g., Wi-Fi network; 3G, 4G/LTE; etc.), mobile communication technologies, components for short-range wireless communication (e.g., using Bluetooth communication standards, NFC, etc.), other components, or combinations of technologies. In some embodiments, communication interface 1608 can provide wired connectivity (e.g., universal serial bus, Ethernet, universal asynchronous receiver/transmitter, etc.) in addition to, or in lieu of, a wireless interface. Communication interface 1608 can be implemented using a combination of hardware (e.g., driver circuits, antennas, modulators/demodulators, encoders/decoders, and other analog and/or digital signal processing circuits) and software components. In some embodiments, communication interface 1608 can support multiple communication channels concurrently. In some embodiments, the communication interface 1608 is not used.

It will be appreciated that computing device 1600 is illustrative and that variations and modifications are possible. A computing device can have various functionality not specifically described (e.g., voice communication via cellular telephone networks) and can include components appropriate to such functionality.

Further, while the computing device 1600 is described with reference to particular blocks, it is to be understood that these blocks are defined for convenience of description and are not intended to imply a particular physical arrangement of component parts. For example, the processing subsystem 1602, the storage subsystem 1604, the user interface 1606, and/or the communication interface 1608 can be in one device or distributed among multiple devices.

Further, the blocks need not correspond to physically distinct components. Blocks can be configured to perform various operations, e.g., by programming a processor or providing appropriate control circuitry, and various blocks might or might not be reconfigurable depending on how an initial configuration is obtained. Embodiments can be realized in a variety of apparatus including electronic devices implemented using a combination of circuitry and software. Electronic devices described herein can be implemented using computing device 1600.

Various features described herein, e.g., methods, apparatus, computer-readable media and the like, can be realized using a combination of dedicated components, programmable processors, and/or other programmable devices. Processes described herein can be implemented on the same processor or different processors. Where components are described as being configured to perform certain operations, such configuration can be accomplished, e.g., by designing electronic circuits to perform the operation, by programming programmable electronic circuits (such as microprocessors) to perform the operation, or a combination thereof. Further, while the embodiments described above may make reference to specific hardware and software components, those skilled in the art will appreciate that different combinations of hardware and/or software components may also be used and that particular operations described as being implemented in hardware might be implemented in software or vice versa.

Specific details are given in the above description to provide an understanding of the embodiments. However, it is understood that the embodiments may be practiced without these specific details. In some instances, well-known circuits, processes, algorithms, structures, and techniques may be shown without unnecessary detail in order to avoid obscuring the embodiments.

While the principles of the disclosure have been described above in connection with specific apparatus and methods, it is to be understood that this description is made only by way of example and not as limitation on the scope of the disclosure. Embodiments were chosen and described in order to explain principles and practical applications to enable others skilled in the art to utilize the invention in various embodiments and with various modifications, as are suited to a particular use contemplated. It will be appreciated that the description is intended to cover modifications and equivalents.

Also, it is noted that the embodiments may be described as a process which is depicted as a flowchart, a flow diagram, a data flow diagram, a structure diagram, or a block diagram. Although a flowchart may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations may be re-arranged. A process is terminated when its operations are completed, but could have additional steps not included in the figure. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, etc.

A recitation of “a”, “an”, or “the” is intended to mean “one or more” unless specifically indicated to the contrary. Patents, patent applications, publications, and descriptions mentioned here are incorporated by reference in their entirety for all purposes. None is admitted to be prior art.

What is claimed is:
 1. An apparatus for segmenting an image, the apparatus comprising: a camera; and one or more memory device comprising instructions that, when executed, cause one or more processors to perform the following steps: receiving an image acquired by the camera, wherein the image comprises a first set of characters and a second set of characters; classifying the first set of characters as an identifier; classifying the second set of characters as data associated with the identifier; dividing the image to create an image segment, after classifying the first set of characters and the second set of characters, wherein the image segment includes the first set of characters and not the second set of characters; decoding the first set of characters in the image segment to generate a first character string; decoding the second set of characters to generate a second character string; and linking the first character string to the second character string based on classifying the first set of characters as the identifier and the second set of characters as the data associated with the identifier.
 2. The apparatus of claim 1, wherein the instructions, when executed, further cause the one or more processors to perform the following steps: analyzing the image to identify text; and dividing the image into one or more segments of text, based on analyzing the image to identify text, before classifying the first set of characters and the second set of characters.
 3. The apparatus of claim 1, wherein the second character string comprises information that is also contained in a barcode in the image.
 4. A method for segmenting an image, the method comprising: receiving an image acquired by a camera, wherein the image comprises a first set of characters and a second set of characters; classifying the first set of characters as an identifier; classifying the second set of characters as data associated with the identifier; dividing the image to create an image segment, after classifying the first set of characters and the second set of characters, wherein the image segment includes the first set of characters and not the second set of characters; decoding the first set of characters in the image segment to generate a first character string; decoding the second set of characters to generate a second character string; and linking the first character string to the second character string based on classifying the first set of characters as the identifier and the second set of characters as the data associated with the identifier.
 5. The method of claim 4, wherein decoding the first set of characters is performed by optical character recognition.
 6. The method of claim 4, further comprising: analyzing the image to identify text; and dividing the image into one or more segments of text, based on analyzing the image to identify text, before classifying the first set of characters and the second set of characters.
 7. The method of claim 4, wherein the second character string comprises information that is also contained in a barcode in the image.
 8. The method of claim 4, further comprising dividing the image to create additional image segments of the second set of characters, wherein creating the additional image segments is performed after dividing the image to create the image segment.
 9. The method of claim 4, wherein classifying the first set of characters as the identifier is based on proximity of the first set of characters to a barcode in the image.
 10. The method of claim 9, further comprising decoding the barcode.
 11. The method of claim 4, wherein classifying the first set of characters is based on machine learning.
 12. The method of claim 4, wherein the second set of characters is a serial number, a stock keeping unit, or a model number.
 13. The method of claim 4, further comprising decoding the second set of characters, after decoding the first set of characters, based on the first set of characters being the identifier.
 14. The method of claim 4, wherein the second set of characters is a date.
 15. The method of claim 14, the method further comprising: comparing the date, after decoding the second set of characters, to a current date; and flagging a label with the second set of characters to be replaced based on the date being expired.
 16. A memory device comprising instructions that, when executed, cause one or more processors to perform the following steps for segmenting an image: receiving an image acquired by a camera, wherein the image comprises a first set of characters and a second set of characters; classifying the first set of characters as an identifier; classifying the second set of characters as data associated with the identifier; dividing the image to create an image segment, after classifying the first set of characters and the second set of characters, wherein the image segment includes the first set of characters and not the second set of characters; decoding the first set of characters in the image segment to generate a first character string; decoding the second set of characters to generate a second character string; and linking the first character string to the second character string based on classifying the first set of characters as the identifier and the second set of characters as the data associated with the identifier.
 17. The memory device of claim 16, wherein decoding the first set of characters and is performed by optical character recognition.
 18. The memory device of claim 16, wherein the second character string comprises information that is also contained in a barcode in the image.
 19. The memory device of claim 18, wherein the instructions, when executed, cause the one or more processors to perform the following step: decoding the barcode.
 20. The memory device of claim 16, wherein: the second set of characters is a date; and the instructions, when executed, cause the one or more processors to perform the following steps: comparing the date, after decoding the second set of characters, to a current date; and flagging a label with the second set of characters to be replaced based on the date being expired.